FunAndes – A functional trait database of Andean plants

We introduce the FunAndes database, a compilation of functional trait data for the Andean flora spanning six countries. FunAndes contains data on 24 traits across 2,694 taxa, for a total of 105,466 entries. The database features plant-morphological attributes including growth form, and leaf, stem, and wood traits measured at the species or individual level, together with geographic metadata (i.e., coordinates and elevation). FunAndes follows the field names, trait descriptions and units of measurement of the TRY database. It is currently openly available in the FIGSHARE data repository, and will be part of TRY’s next release. Open-access trait data from Andean plants will contribute to ecological research in the region, the most species-rich terrestrial biodiversity hotspot.


Background & Summary
Functional traits are measurable properties of a plant describing its structure, function or life history strategy that determine species responses to biotic and abiotic environmental conditions across scales of biological complexity, from communities to ecosystems [1][2][3][4]. Exploring variation in plant functional traits provides key insights into plant species distribution, community assembly mechanisms, evolutionary strategies, and potential ecosystem-level responses to global environmental change [5][6][7][8][9][10][11][12][13]. Global databases of plant functional traits, including BIEN 14, GIFT 15, and TRY 16,17, currently feature an unprecedented amount of trait information that supports scientific work on plant functional ecology. Yet the geographical coverage of trait measurements remains limited for highly diverse tropical areas, especially in mountainous regions 15,16.
The tropical Andes is a major hotspot of global biodiversity and endemism. With about 2% of the terrestrial area of the planet, it holds 10% of the species of vascular plants [18][19][20]. However, trait information for Andean plants is underrepresented in global plant trait databases. This information gap limits our understanding of variation in plant trait composition and diversity at regional, continental, and global scales. Synthesizing and harmonizing trait measurements from remote and understudied areas is critical for global and regional data archiving initiatives 21, and for advancing empirical biodiversity research. Here, we present the FunAndes database, a compilation of plant functional traits in the tropical Andes (Fig. 1). The records in FunAndes stem from 18 unpublished datasets contributed by different research groups conducting fieldwork in the region. FunAndes follows the structure and terminology of the TRY database, and is available in the FIGSHARE data repository 22. In total, FunAndes contains 105,466 records of 24 traits, covering 2,694 Andean (morpho-)species in 670 genera and 175 families. Assembling FunAndes encompassed the following steps: 1) developing a TRY-based format for data contributors, 2) reviewing the comparability of the protocols used for trait data collection, 3) checking trait measurement units for each contributed dataset, 4) detecting and deleting suspicious or erroneous trait measurements, and 5) compiling the contributed data into a single source with common taxonomic names, units, and terminology. To our knowledge, FunAndes is the first open access trait database of the Andean flora, filling a substantial gap in global functional trait data. We hope that providing a standardized and curated database on Andean plant traits will encourage plant trait ecological research in Andean ecosystems, as well as comparative studies across tropical regions.
A full list of authors and their affiliations appears at the end of the paper.

Data Descriptor
Open (2022) 9:511 | https://doi.org/10.1038/s41597-022-01626-6 www.nature.com/scientificdata

Methods
Primary sources. We first developed a basic data template containing trait names, trait descriptions and units of measurement, together with contextual information (e.g., site coordinates, collection dates, and number of samples collected). This template was distributed to potential data contributors, scientists collecting vascular plant functional trait data mainly in tropical forests of the Andean region. Filled templates were returned to the writing team, and FunAndes was assembled from 18 distinct datasets containing field data of Andean plant traits (Tables 1 and 2).
Trait definitions and protocols. Trait definitions and trait units of measurement in FunAndes follow those of the TRY database, for a total of 24 plant traits, two categorical and 22 numerical (Table 3). All trait data contributed to FunAndes were obtained from individuals growing in natural vegetation, following standard and comparable methods 23,24. Furthermore, traits were measured mostly in adult individuals, never in seedlings or saplings. Leaf traits were quantified from exposed mature leaves in the plant canopy. A summary of trait geographical representation in FunAndes is presented in Fig. 1. A comparison between trait data in FunAndes and TRY version 5 17 is presented in Table 4.
Database structure. The database contains 24 fields to provide contextual information about data collection, including association of trait data to permanent vegetation plots, site coordinates and collection dates; and information about the trait value provided (e.g., if the value provided is a single observation or an average of trait measurements) (Table 5).
Harmonization. We followed several steps to ensure the quality of the data before adding a contributed dataset to FunAndes. Our workflow consisted of a series of operations, including generating dataset IDs for each contributed dataset, harmonizing data into common measurement units, translating terms (trait values) for categorical variables, verifying and correcting collection coordinates, and identifying erroneous trait data measurements. Each data contributor was contacted to double-check the methods used for trait collection and to correct or eliminate suspicious trait values. Finally, duplicates were removed to create the final version of the database. All data standardization steps were carried out in R 16 using built-in functions and the package 'dplyr' 25.
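
The unit harmonization and duplicate removal steps described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the authors' R/dplyr code. The field names (`dataset_id`, `individual_id`, `trait`, `unit`, `value`), the trait labels, the target units and the conversion factors are all assumptions, not the FunAndes schema.

```python
# Illustrative sketch only: field names, trait labels, target units and
# conversion factors are assumptions, not the FunAndes schema.

# Assumed common unit for each trait.
TARGET_UNIT = {"leaf_area": "mm2", "leaf_thickness": "mm"}

# Multiplicative factors from a source unit to the assumed target unit.
TO_TARGET = {
    ("leaf_area", "cm2"): 100.0,       # 1 cm2 = 100 mm2
    ("leaf_area", "mm2"): 1.0,
    ("leaf_thickness", "um"): 0.001,   # 1 micrometre = 0.001 mm
    ("leaf_thickness", "mm"): 1.0,
}


def harmonize(records):
    """Convert each value to its trait's target unit, then drop duplicates.

    Two records count as duplicates when dataset, individual, trait and
    converted value all coincide (an assumed definition of "duplicate").
    """
    seen = set()
    cleaned = []
    for rec in records:
        factor = TO_TARGET[(rec["trait"], rec["unit"])]
        value = round(rec["value"] * factor, 6)
        key = (rec["dataset_id"], rec["individual_id"], rec["trait"], value)
        if key in seen:
            continue  # same measurement already kept
        seen.add(key)
        cleaned.append({**rec, "value": value, "unit": TARGET_UNIT[rec["trait"]]})
    return cleaned


records = [
    {"dataset_id": 1, "individual_id": "A1", "trait": "leaf_area", "unit": "cm2", "value": 12.5},
    {"dataset_id": 1, "individual_id": "A1", "trait": "leaf_area", "unit": "mm2", "value": 1250.0},
    {"dataset_id": 1, "individual_id": "A1", "trait": "leaf_thickness", "unit": "um", "value": 250.0},
]
harmonized = harmonize(records)
```

In this toy input the first two records describe the same leaf area in different units, so only one of them survives harmonization. A unit missing from the lookup table raises a `KeyError`, which forces it to be resolved explicitly instead of being passed through silently.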

Taxonomy. Species name standardization was conducted with the R package 'LCVP' of The Leipzig Catalogue of Vascular Plants 18. Original species names were compared to LCVP names by searching for matches. Non-matches (mainly caused by incorrect spelling) were revised by an expert in Andean flora (J.H.), and corrected following LCVP. The final FunAndes database reports both the original and the updated taxon name alongside each trait record. For each morphospecies, higher taxonomic affiliations obtained from the LCVP were included.
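
The match-then-review logic described above can be sketched as follows. This is a Python illustration, not the 'LCVP' R package or the authors' script. The three-name reference list stands in for the catalogue, and the fuzzy suggestions from the standard-library `difflib` module are an assumed aid: in the workflow described here, non-matches were resolved by an expert, not automatically.

```python
import difflib


def match_names(original_names, reference_names, cutoff=0.85):
    """Separate exact matches from non-matches that need expert review.

    Returns two dicts keyed by the original name:
    - matched: original name -> accepted reference name
    - needs_review: original name -> list of close reference candidates
    """
    reference = sorted(set(reference_names))
    reference_set = set(reference)
    matched, needs_review = {}, {}
    for name in original_names:
        cleaned = " ".join(name.split())  # collapse stray whitespace
        if cleaned in reference_set:
            matched[name] = cleaned
        else:
            # Suggestions only; a taxonomic expert makes the final decision.
            needs_review[name] = difflib.get_close_matches(
                cleaned, reference, n=3, cutoff=cutoff
            )
    return matched, needs_review


# Toy reference list standing in for the catalogue (assumption).
reference = ["Weinmannia pinnata", "Clusia multiflora", "Hedyosmum racemosum"]
contributed = ["Weinmannia pinnata", "Clusia multifora", "Hedyosmum  racemosum"]

matched, needs_review = match_names(contributed, reference)
```

Keeping the original name as the dictionary key mirrors the database's practice of reporting both the original and the updated taxon name for each record.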

Data Records
Access. The FunAndes database is stored in, and available for direct download from, the FIGSHARE data repository 22 and will become available from the TRY Plant Trait Database in the next release (https://www.try-db.org).
(Table 4). Leaf trait data make up 67.7% of the database, followed by whole plant (i.e., plant growth form and leaf compoundness) (17.8 and 17.6%, respectively) and stem traits (14.5%). Each species has an average of 7.4 (SD = 5.1) distinct traits. All observations have geographic coordinates.
Considering the Andean countries, Ecuador has 47.8% of all the trait observations in FunAndes, followed by Peru (25.0%) and Bolivia (19.5%) (Fig. 1, Table 1). Data in FunAndes come from 788 collection sites (i.e., unique combinations of latitude and longitude) and are associated with 570 forest plots. Furthermore, trait observations are grouped mainly around 500, 1,000, 2,000 and 3,000 m of elevation (Fig. 2a). The data are widely distributed along a gradient of mean annual temperature, but clustered toward lower values of total mean annual precipitation (Fig. 2b).
The five most represented plant functional traits in FunAndes, namely plant growth form, leaf compoundness, specific leaf area (SLA), wood density, and leaf thickness, are homogeneously distributed in the tree phylogeny (Fig. 3).
TRY version 5 23 hosts 8,548 entries for Andean plants, corresponding to 1,123 species, and 15 of the 24 functional traits held in FunAndes (Table 4). FunAndes, therefore, will increase available trait data by a factor of 12, and at least double the current representation of traits per species in TRY. In consequence, FunAndes is a substantial contribution to plant functional trait data availability for the Andean region.
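
As a quick check of the comparison above, the ratios can be recomputed from the counts quoted in this paper. The short Python sketch below uses only those counts; the variable names are ours, and the ratios are simple quotients that do not account for any overlap between the two databases.

```python
# Counts quoted in the text.
funandes = {"entries": 105_466, "species": 2_694, "traits": 24}
try_andean = {"entries": 8_548, "species": 1_123, "traits": 15}

# Ratio of FunAndes to TRY version 5 (Andean plants) for each quantity.
ratios = {key: funandes[key] / try_andean[key] for key in funandes}

for key, value in ratios.items():
    print(f"{key}: {value:.1f}x")
```

The entry ratio of about 12.3 corresponds to the "factor of 12" stated above.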

Technical Validation
For each contributed dataset, we visually inspected all data and metadata, producing histograms of the values of each trait to identify outliers or mistaken measurements. In most cases, extreme values were discussed with data contributors to decide whether to correct or eliminate erroneous observations. With the final version of the database, histograms were produced once again to check for outliers or mistaken values.
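
A minimal sketch of this kind of check is given below in Python, for illustration only. The validation described here was a visual inspection of histograms. The interquartile-range rule in `flag_extremes` is a swapped-in automated aid that we assume for the example, not the authors' procedure, and the wood density values are invented toy data.

```python
import statistics


def histogram(values, n_bins=10):
    """Return (bin_lower_edge, count) pairs for equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    return [(lo + i * width, c) for i, c in enumerate(counts)]


def flag_extremes(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] for discussion with contributors."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


# Invented wood density values (g/cm3); 5.8 mimics a misplaced decimal point.
wood_density = [0.45, 0.52, 0.58, 0.61, 0.55, 0.49, 0.63, 5.8]

for edge, count in histogram(wood_density, n_bins=5):
    print(f"{edge:5.2f} | {'#' * count}")

suspicious = flag_extremes(wood_density)
```

Flagged values are candidates for review only. As described above, the decision to correct or eliminate an observation was made together with the data contributors.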

Usage Notes
The data can be downloaded from the FIGSHARE data repository under the terms of the Creative Commons Zero (CC0) waiver. We also provide the FunAndes database in the TRY Plant Trait Database (https://www.try-db.org). Users of FunAndes data are invited to cite this publication: Báez et al. xx. FunAndes – A functional trait database of Andean plants. Scientific Data. 00:00-00, and the accompanying FIGSHARE dataset 22.

Code Availability
The contributed datasets were provided in Excel spreadsheets (Microsoft Office 2013); therefore, no code is available for this step. Scripts to conduct taxonomic standardization using the LCVP, and to plot environmental distribution and trait representation in the plant phylogeny, are available at FIGSHARE 22. The scripts were developed in R.